Concepedia

Concept

lifelong reinforcement learning

1.4K Publications
86.9K Citations
3.9K Authors
943 Institutions

Continual Reinforcement Learning Transfer

2013 - 2019

During this era, reinforcement learning research increasingly focused on learning across sequences of tasks, combining modular and hierarchical architectures, expert gating, and progressive task curricula. These approaches supported knowledge transfer and reduced catastrophic forgetting as agents encountered varied environments and objectives, with emphasis on memory-efficient replay strategies, task-aware memory management, and structured knowledge reuse across tasks. Exploration and representation learning were improved through intrinsic motivation, stochastic perturbations, and entropy-regularized objectives, which encouraged robust and diverse behaviors in progressively richer environments, often spanning multiple domains. Researchers adopted scalable architectures such as modular networks, networks of experts, and differentiable planning so that agents could reuse earlier knowledge while adding capacity for new tasks, and successor-feature formalisms helped generalize policies across domains.

Continual and lifelong RL builds capabilities across sequential tasks through modular networks, expert gating, and curriculum-style task progression, so that knowledge transfers between tasks and catastrophic forgetting is reduced [6], [5], [10], [19].
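To make the idea of expert gating concrete, here is a minimal sketch of a soft gate that blends the action values of several task-specific experts. It is an illustration under stated assumptions, not the method of any cited work: the names `gated_action_values`, `experts`, and `gate` are hypothetical, and a softmax-weighted mixture is only one of several possible gating schemes.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def gated_action_values(observation, experts, gate):
    """Blend per-expert action values using softmax gate weights.

    experts: list of callables, each mapping an observation to a list of
             action values (hypothetical stand-ins for per-task modules).
    gate:    callable mapping an observation to one raw score per expert
             (hypothetical; in practice this would be learned).
    """
    weights = softmax(gate(observation))
    outputs = [expert(observation) for expert in experts]
    n_actions = len(outputs[0])
    blended = [
        sum(w * out[a] for w, out in zip(weights, outputs))
        for a in range(n_actions)
    ]
    return blended, weights


# Two toy experts that prefer different actions.
experts = [lambda obs: [1.0, 0.0], lambda obs: [0.0, 1.0]]
# A gate that strongly favors the first expert for this observation.
values, weights = gated_action_values(None, experts, lambda obs: [50.0, 0.0])
```

Because the gate only reweights existing experts, a new task can in principle be handled by adding an expert and extending the gate, leaving earlier experts unchanged.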

Memory usage in RL is optimized through prioritized sampling [1], curated replay databases [20], and hierarchical replay [16], which improve sample efficiency and knowledge reuse across tasks [17].
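Prioritized sampling can be sketched as a replay buffer that draws transitions in proportion to a priority raised to a power. The version below is a toy illustration, assuming priorities are set from the absolute TD error; the class name `PrioritizedReplay` and the parameters `alpha` and `eps` are hypothetical choices, and importance-sampling corrections are omitted for brevity.

```python
import random


class PrioritizedReplay:
    """Toy proportional prioritized replay buffer (illustrative only)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha      # 0 gives uniform sampling, 1 fully proportional
        self.items = []         # stored transitions
        self.priorities = []    # one positive priority per transition
        self.next_slot = 0      # ring-buffer write position

    def add(self, transition, priority=1.0):
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:
            # Buffer is full: overwrite the oldest entry.
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def probabilities(self):
        """Sampling probability of each stored transition."""
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, k, rng=random):
        """Draw k (index, transition) pairs with replacement."""
        indices = rng.choices(
            range(len(self.items)), weights=self.probabilities(), k=k
        )
        return [(i, self.items[i]) for i in indices]

    def update_priority(self, index, td_error, eps=1e-3):
        # eps keeps every transition sampleable even when its TD error is zero.
        self.priorities[index] = abs(td_error) + eps
```

A typical loop would call `sample`, compute TD errors for the drawn transitions, and pass each error back through `update_priority` so that surprising transitions are replayed more often.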

Exploration enhancements combine intrinsic motivation [2], stochastic weight perturbations [18], and entropy-based objectives [4] to drive diverse behaviors and more reliable learning in RL; the challenges posed by rich environments are highlighted in [11].
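Two of these exploration signals are easy to illustrate in a few lines. The sketch below assumes a count-based novelty bonus as a stand-in for intrinsic motivation (the cited works may use different formulations) and a softmax policy for the entropy term; `CountBonus`, `beta`, and `temperature` are hypothetical names chosen for this example.

```python
import math
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


class CountBonus:
    """Intrinsic reward beta / sqrt(N(s)) that decays as a state is revisited."""

    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = Counter()

    def __call__(self, state):
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])


def entropy_regularized_value(logits, returns_per_action, temperature=0.01):
    """Expected return under a softmax policy plus temperature * entropy."""
    probs = softmax(logits)
    expected = sum(p * g for p, g in zip(probs, returns_per_action))
    return expected + temperature * entropy(probs)


bonus = CountBonus(beta=0.1)
shaped = 1.0 + bonus("room_1")   # extrinsic reward plus novelty bonus
objective = entropy_regularized_value([0.0, 0.0], [1.0, 0.0], temperature=1.0)
```

The bonus rewards visiting unfamiliar states, while the entropy term penalizes policies that collapse onto a single action too early; both push the agent toward more diverse behavior.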

Hierarchical and modular architectures enable scalable transfer across tasks through temporal abstraction and planning modules [2], progressive networks [6], networks of experts [5], and differentiable planning [9]; multi-domain dialogue [8] illustrates a cross-domain application.

Cross-domain transfer is formalized with successor features and generalized policy improvement [13], zero-shot transfer from task features [3], and cross-domain lifelong transfer RL [19], with hierarchical replay supporting transfer [16].
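The combination of successor features and generalized policy improvement can be sketched as follows. If rewards are linear in some features with task weights w, the action value of a previously learned policy i is the dot product of its successor features with w, and generalized policy improvement picks the action that maximizes this value over all stored policies. The code is a minimal illustration with hand-written numbers; `gpi_action` and the nested-list layout are hypothetical choices, and learning the successor features themselves is out of scope here.

```python
def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def gpi_action(psi_per_policy, w):
    """Generalized policy improvement over stored successor features.

    psi_per_policy[i][a] is the successor-feature vector of policy i for
    action a in the current state (assumed to be already learned).
    w is the reward-weight vector of the new task.
    Returns the greedy action and its estimated value.
    """
    best_action, best_value = None, float("-inf")
    for psi in psi_per_policy:
        for action, features in enumerate(psi):
            q = dot(features, w)    # Q_i(s, a) = psi_i(s, a) . w
            if q > best_value:
                best_action, best_value = action, q
    return best_action, best_value


# Successor features of two old policies, two actions, two features each.
psi = [
    [[1.0, 0.0], [0.0, 0.5]],   # policy 0
    [[0.2, 0.2], [0.0, 2.0]],   # policy 1
]
action_a, value_a = gpi_action(psi, w=[1.0, 0.0])   # task rewarding feature 0
action_b, value_b = gpi_action(psi, w=[0.0, 1.0])   # task rewarding feature 1
```

Only the weight vector changes between the two calls, which is what lets a new task reuse old policies without relearning their dynamics.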

Continual Lifelong Reinforcement Learning

2020 - 2023